Scaling prediction errors to reward variability benefits error-driven learning in humans

Authors

  • Kelly M. J. Diederen
  • Wolfram Schultz
Abstract

Effective error-driven learning requires individuals to adapt learning to environmental reward variability. The adaptive mechanism may involve decays in learning rate across subsequent trials, as shown previously, and rescaling of reward prediction errors. The present study investigated the influence of prediction error scaling and, in particular, the consequences for learning performance. Participants explicitly predicted reward magnitudes that were drawn from different probability distributions with specific standard deviations. By fitting the data with reinforcement learning models, we found scaling of prediction errors, in addition to the learning rate decay shown previously. Importantly, the prediction error scaling was closely related to learning performance, defined as accuracy in predicting the mean of reward distributions, across individual participants. In addition, participants who scaled prediction errors relative to standard deviation also presented with more similar performance for different standard deviations, indicating that increases in standard deviation did not substantially decrease "adapters'" accuracy in predicting the means of reward distributions. However, exaggerated scaling beyond the standard deviation resulted in impaired performance. Thus efficient adaptation makes learning more robust to changing variability.
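The two mechanisms named in the abstract, a learning rate that decays across trials and a prediction error rescaled by the standard deviation of the reward distribution, can be illustrated with a minimal sketch. This is not the authors' fitted model: the update rule, the hyperbolic decay, the function name `simulate_learner`, and the parameters `alpha0`, `decay`, and `scale_exponent` are all assumptions chosen for illustration. Here `scale_exponent = 0` means no scaling, `1` means the error is divided by the standard deviation, and values above `1` stand in for scaling beyond the standard deviation.

```python
import random


def simulate_learner(rewards, sd, alpha0=0.5, decay=0.1,
                     scale_exponent=1.0, initial_prediction=0.0):
    """Return the prediction held after each trial.

    Hypothetical error-driven learner (illustrative, not the paper's model):
      alpha_t      = alpha0 / (1 + decay * t)        decaying learning rate
      scaled_error = (reward - prediction) / sd**k   k = scale_exponent
      prediction  += alpha_t * scaled_error
    """
    prediction = initial_prediction
    predictions = []
    for t, reward in enumerate(rewards):
        alpha = alpha0 / (1.0 + decay * t)          # learning rate decay
        error = reward - prediction                 # reward prediction error
        scaled_error = error / (sd ** scale_exponent)  # scaling to variability
        prediction += alpha * scaled_error
        predictions.append(prediction)
    return predictions


def mean_abs_deviation(predictions, true_mean):
    """Average distance between predictions and the distribution mean."""
    return sum(abs(p - true_mean) for p in predictions) / len(predictions)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same draws
    true_mean = 50.0
    for sd in (5.0, 10.0, 15.0):  # arbitrary example standard deviations
        rewards = [rng.gauss(true_mean, sd) for _ in range(200)]
        for k in (0.0, 1.0):
            preds = simulate_learner(rewards, sd, scale_exponent=k,
                                     initial_prediction=true_mean)
            # Score only the later trials, once the prediction has settled.
            score = mean_abs_deviation(preds[100:], true_mean)
            print(f"sd={sd:5.1f}  scale_exponent={k:.1f}  deviation={score:.2f}")
```

Running the script prints, for each example standard deviation, how far the late-trial predictions sit from the true mean with and without scaling. The numbers depend entirely on the assumed parameters and should not be read as a reproduction of the reported results.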


Similar resources

Scaling prediction errors to reward variability benefits error-driven learning in humans. Running title: Reward prediction error scaling correlates with learning efficiency

Department of Physiology, Development, and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom. Correspondence should be addressed to Kelly M.J. Diederen, Department of Physiology, Development, and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK. E-mail: [email protected], Tel: +44 (0)1223 333754, Fax: +44 (0)1223 333840 ...


Adaptive Prediction Error Coding in the Human Midbrain and Striatum Facilitates Behavioral Adaptation and Learning Efficiency

Effective error-driven learning benefits from scaling of prediction errors to reward variability. Such behavioral adaptation may be facilitated by neurons coding prediction errors relative to the standard deviation (SD) of reward distributions. To investigate this hypothesis, we required participants to predict the magnitude of upcoming reward drawn from distributions with different SDs. After ...


Dopamine Modulates Adaptive Prediction Error Coding in the Human Midbrain and Striatum

Learning to optimally predict rewards requires agents to account for fluctuations in reward value. Recent work suggests that individuals can efficiently learn about variable rewards through adaptation of the learning rate, and coding of prediction errors relative to reward variability. Such adaptive coding has been linked to midbrain dopamine neurons in nonhuman primates, and evidence in suppor...


BOLD Responses to Negative Reward Prediction Errors in Human Habenula

Although positive reward prediction error, a key element in learning that is signaled by dopamine cells, has been extensively studied, little is known about negative reward prediction errors in humans. Detailed animal electrophysiology shows that the habenula, an integrative region involved in many processes including learning, reproduction, and stress responses, also encodes negative reward-re...


Reward expectation and prediction error in human medial frontal cortex: An EEG study

The mammalian medial frontal cortex (MFC) is involved in reward-based decision making. In particular, in nonhuman primates this area constructs expectations about upcoming rewards, given an environmental state or a choice planned by the animal. At the same time, in both humans and nonhuman primates, the MFC computes the difference between such predictions and actual environmental outcomes (rewa...



Journal title:

Volume 114, Issue

Pages -

Publication date: 2015